Expression of emotions is a crucial part of daily human communication. Emotion recognition in conversations (ERC) is an emerging field of study, where the primary task is to identify the emotion behind each utterance in a conversation. Though a lot of work has been done on ERC in the past, these works only focus on ERC in the English language, thereby ignoring any other languages. In this paper, we present Multilingual MELD (M-MELD), where we extend the Multimodal EmotionLines Dataset (MELD) \cite{poria2018meld} to 4 other languages beyond English, namely Greek, Polish, French, and Spanish. Beyond just establishing strong baselines for all of these 4 languages, we also propose a novel architecture, DiscLSTM, that uses both sequential and conversational discourse context in a conversational dialogue for ERC. Our proposed approach is computationally efficient, can transfer across languages using just a cross-lingual encoder, and achieves better performance than most uni-modal text approaches in the literature on both MELD and M-MELD. We make our data and code publicly on GitHub.
translated by 谷歌翻译
We consider the problem of continually releasing an estimate of the population mean of a stream of samples that is user-level differentially private (DP). At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now these requirements of continual release and user-level privacy were considered in isolation. But, in practice, both these requirements come together as the users often contribute data repeatedly and multiple queries are made. We provide an algorithm that outputs a mean estimate at every time instant $t$ such that the overall release is user-level $\varepsilon$-DP and has the following error guarantee: Denoting by $M_t$ the maximum number of samples contributed by a user, as long as $\tilde{\Omega}(1/\varepsilon)$ users have $M_t/2$ samples each, the error at time $t$ is $\tilde{O}(1/\sqrt{t}+\sqrt{M}_t/t\varepsilon)$. This is a universal error guarantee which is valid for all arrival patterns of the users. Furthermore, it (almost) matches the existing lower bounds for the single-release setting at all time instants when users have contributed equal number of samples.
translated by 谷歌翻译
Machine learning (ML) has recently facilitated many advances in solving problems related to many-body physical systems. Given the intrinsic quantum nature of these problems, it is natural to speculate that quantum-enhanced machine learning will enable us to unveil even greater details than we currently have. With this motivation, this paper examines a quantum machine learning approach based on shallow variational ansatz inspired by tensor networks for supervised learning tasks. In particular, we first look at the standard image classification tasks using the Fashion-MNIST dataset and study the effect of repeating tensor network layers on ansatz's expressibility and performance. Finally, we use this strategy to tackle the problem of quantum phase recognition for the transverse-field Ising and Heisenberg spin models in one and two dimensions, where we were able to reach $\geq 98\%$ test-set accuracies with both multi-scale entanglement renormalization ansatz (MERA) and tree tensor network (TTN) inspired parametrized quantum circuits.
translated by 谷歌翻译
Soft actuators have attracted a great deal of interest in the context of rehabilitative and assistive robots for increasing safety and lowering costs as compared to rigid-body robotic systems. During actuation, soft actuators experience high levels of deformation, which can lead to microscale fractures in their elastomeric structure, which fatigues the system over time and eventually leads to macroscale damages and eventually failure. This paper reports finite element modeling (FEM) of pneu-nets at high angles, along with repetitive experimentation at high deformation rates, in order to study the effect and behavior of fatigue in soft robotic actuators, which would result in deviation from the ideal behavior. Comparing the FEM model and experimental data, we show that FEM can model the performance of the actuator before fatigue to a bending angle of 167 degrees with ~96% accuracy. We also show that the FEM model performance will drop to 80% due to fatigue after repetitive high-angle bending. The results of this paper objectively highlight the emergence of fatigue over cyclic activation of the system and the resulting deviation from the computational FEM model. Such behavior can be considered in future controllers to adapt the system with time-variable and non-autonomous response dynamics of soft robots.
translated by 谷歌翻译
As Deep Neural Networks (DNNs) are increasingly deployed in safety critical and privacy sensitive applications such as autonomous driving and biometric authentication, it is critical to understand the fault-tolerance nature of DNNs. Prior work primarily focuses on metrics such as Failures In Time (FIT) rate and the Silent Data Corruption (SDC) rate, which quantify how often a device fails. Instead, this paper focuses on quantifying the DNN accuracy given that a transient error has occurred, which tells us how well a network behaves when a transient error occurs. We call this metric Resiliency Accuracy (RA). We show that existing RA formulation is fundamentally inaccurate, because it incorrectly assumes that software variables (model weights/activations) have equal faulty probability under hardware transient faults. We present an algorithm that captures the faulty probabilities of DNN variables under transient faults and, thus, provides correct RA estimations validated by hardware. To accelerate RA estimation, we reformulate RA calculation as a Monte Carlo integration problem, and solve it using importance sampling driven by DNN specific heuristics. Using our lightweight RA estimation method, we show that transient faults lead to far greater accuracy degradation than what todays DNN resiliency tools estimate. We show how our RA estimation tool can help design more resilient DNNs by integrating it with a Network Architecture Search framework.
translated by 谷歌翻译
We present RecD (Recommendation Deduplication), a suite of end-to-end infrastructure optimizations across the Deep Learning Recommendation Model (DLRM) training pipeline. RecD addresses immense storage, preprocessing, and training overheads caused by feature duplication inherent in industry-scale DLRM training datasets. Feature duplication arises because DLRM datasets are generated from interactions. While each user session can generate multiple training samples, many features' values do not change across these samples. We demonstrate how RecD exploits this property, end-to-end, across a deployed training pipeline. RecD optimizes data generation pipelines to decrease dataset storage and preprocessing resource demands and to maximize duplication within a training batch. RecD introduces a new tensor format, InverseKeyedJaggedTensors (IKJTs), to deduplicate feature values in each batch. We show how DLRM model architectures can leverage IKJTs to drastically increase training throughput. RecD improves the training and preprocessing throughput and storage efficiency by up to 2.49x, 1.79x, and 3.71x, respectively, in an industry-scale DLRM training system.
translated by 谷歌翻译
Spiking Neural Networks (SNNs) are bio-plausible models that hold great potential for realizing energy-efficient implementations of sequential tasks on resource-constrained edge devices. However, commercial edge platforms based on standard GPUs are not optimized to deploy SNNs, resulting in high energy and latency. While analog In-Memory Computing (IMC) platforms can serve as energy-efficient inference engines, they are accursed by the immense energy, latency, and area requirements of high-precision ADCs (HP-ADC), overshadowing the benefits of in-memory computations. We propose a hardware/software co-design methodology to deploy SNNs into an ADC-Less IMC architecture using sense-amplifiers as 1-bit ADCs replacing conventional HP-ADCs and alleviating the above issues. Our proposed framework incurs minimal accuracy degradation by performing hardware-aware training and is able to scale beyond simple image classification tasks to more complex sequential regression tasks. Experiments on complex tasks of optical flow estimation and gesture recognition show that progressively increasing the hardware awareness during SNN training allows the model to adapt and learn the errors due to the non-idealities associated with ADC-Less IMC. Also, the proposed ADC-Less IMC offers significant energy and latency improvements, $2-7\times$ and $8.9-24.6\times$, respectively, depending on the SNN model and the workload, compared to HP-ADC IMC.
translated by 谷歌翻译
本文研究了与可解释的AI(XAI)实践有关的两个不同但相关的问题。机器学习(ML)在金融服务中越来越重要,例如预批准,信用承销,投资以及各种前端和后端活动。机器学习可以自动检测培训数据中的非线性和相互作用,从而促进更快,更准确的信用决策。但是,机器学习模型是不透明的,难以解释,这是建立可靠技术所需的关键要素。该研究比较了各种机器学习模型,包括单个分类器(逻辑回归,决策树,LDA,QDA),异质集合(Adaboost,随机森林)和顺序神经网络。结果表明,整体分类器和神经网络的表现优于表现。此外,使用基于美国P2P贷款平台Lending Club提供的开放式访问数据集评估了两种先进的事后不可解释能力 - 石灰和外形来评估基于ML的信用评分模型。对于这项研究,我们还使用机器学习算法来开发新的投资模型,并探索可以最大化盈利能力同时最大程度地降低风险的投资组合策略。
translated by 谷歌翻译
执法和城市安全受到监视系统中的暴力事件的严重影响。尽管现代(智能)相机广泛可用且负担得起,但在大多数情况下,这种技术解决方案无能为力。此外,监测CCTV记录的人员经常显示出迟来的反应,从而导致对人和财产的灾难。因此,对迅速行动的暴力自动检测至关重要。拟议的解决方案使用了一种新颖的端到端深度学习视频视觉变压器(Vivit),可以在视频序列中熟练地辨别战斗,敌对运动和暴力事件。该研究提出了利用数据增强策略来克服较弱的电感偏见的缺点,同时在较小的培训数据集中训练视觉变压器。评估的结果随后可以发送给当地有关当局,可以分析捕获的视频。与最先进的(SOTA)相比,所提出的方法在某些具有挑战性的基准数据集上实现了吉祥的性能。
translated by 谷歌翻译
今天,参加在线论坛上的讨论非常普遍,这些讨论已经开始对在线用户的整体意见产生强大的影响。 Naturally, twisting the flow of the argument can have a strong impact on the minds of naive users, which in the long run might have socio-political ramifications, for example, winning an election or spreading targeted misinformation.因此,这些平台可能非常容易受到恶意玩家的影响,他们可能会单独采取行动,也可能是繁殖谬误的争论,并动机促进公众舆论。 AD HOMINEM论点是此类谬论中最有效的形式之一。尽管是一个简单的谬论,但它足够有效,可以在离线世界中进行公开辩论,并且可以用作阻止诽谤反对派声音的先驱。在这项工作中,我们迈出了第一步,以阐明野外Ad Hominem谬论的使用。首先,我们建立了一个具有很高准确性的强大AD HOMINEM探测器(F1超过83%,对先前的工作显示出显着改善),即使对于注释的实例构成很小一部分的数据集也是如此。然后,我们在从在线辩论论坛中收集的265k参数(创建者)中使用了我们的检测器。我们的众包调查验证了我们对创建ebate数据的野外预测(94%与手动注释相匹配)。我们的分析表明,令人惊讶的31.23%的创建ebate内容包含AD HOMINEM谬论,并且一群高度活跃的用户的同类发表了更大的AD AD本人,以抑制相反的观点。然后,我们的时间分析表明,自2016年美国总统大选以来,AD HOMINEM论点的使用量显着增加,不仅是政治等主题,而且对于科学和法律。最后,我们讨论了我们的工作的重要意义,以检测和防御AD HOMINEM谬论。
translated by 谷歌翻译